31 results found.
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic Bengali Dari English German Hindi Iranian Persian Japanese Korean Mandarin Chinese Persian Russian Spanish Standard Arabic Tamil Thai Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
66 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2007 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Amharic Bosnian Croatian Dari English French Georgian Haitian Hausa Hindi Korean Mandarin Chinese Persian Portuguese Pushto Russian Spanish Turkish Ukrainian Urdu Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
215 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2009 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic Bengali Dari Egyptian Arabic English Georgian Hindi Iranian Persian Italian Japanese Khmer Korean Lao Mandarin Chinese Min Nan Chinese Moroccan Arabic Panjabi Persian Russian Spanish Tagalog Thai Tigrinya Urdu
Availability:
From Owner
License:
LDC
Size:
640 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2008 NIST Speaker Recognition Evaluation | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
Arabic Bengali Dari Egyptian Arabic English Georgian Hindi Iranian Persian Italian Japanese Khmer Korean Lao Mandarin Chinese Min Nan Chinese Moroccan Arabic Panjabi Persian Russian Spanish Tagalog Thai Tigrinya Urdu
Availability:
From Owner
License:
LDC
Size:
950 hours
Production Status:
Existing-updated
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2008 NIST Speaker Recognition Evaluation Training Set Part 2 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Cantonese English French German Gishu Greek Gujarati Hebrew Hindi Indonesian Japanese Korean Mandarin Persian Portuguese Runyankore Russian Spanish Turkish Vietnamese
Availability:
Freely Available
License:
OpenSource
Size:
22.8 GByte
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: Speaking rate, information density, and information rate in first-language and second-language speech
- Paper track: 1.10 Bilingual and L2 acquisition and processing/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ann Bradlow | The ALLSSTAR Corpus | /N |
Documentation:
Documentation in English is available to the public (via the project website)
Written
Grammar/Language Model,
Language Type:
Multilingual
Languages:
English French Lithuanian Persian
Availability:
Freely Available
License:
CC-BY-SA
Size:
1 MByte
Production Status:
Newly created-in progress
Use:
Natural Language Generation
- Paper title: GenDR: A Generic Deep Realizer with Complex Lexicalization
- Paper track: Written
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | François Lareau | OLST, Université de Montréal | CA |
| Author 2 | Florie Lambrey | Université de Montréal | CA |
| Author 3 | Ieva Dubinskaite | Université de Montréal | CA |
| Author 4 | Daniel Galarreta-Piquette | Université de Montréal | CA |
| Author 5 | Maryam Nejat | Université de Montréal | CA |
| Main Contact | François Lareau | OLST, Université de Montréal | None |
Documentation:
Partial documentation in English
Language Type:
Multilingual
Languages:
English Persian
Availability:
<Not Specified>
License:
<Not Specified>
Size:
200000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Extracting an English-Persian Parallel Corpus from Comparable Corpora
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Akbar Karimi | Institute for Advanced Studies in Basic Sciences (IASBS) | IR |
| Author 2 | Ebrahim Ansari | Institute for Advanced Studies in Basic Sciences (IASBS) | IR |
| Author 3 | Bahram Sadeghi Bigham | Department of Computer Sciences and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS) | IR |
| Main Contact | Ebrahim Ansari | Institute for Advanced Studies in Basic Sciences (IASBS) | None |
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Multilingual
Languages:
Persian
Availability:
is not yet released
License:
The project was funded by Computer Research Center of Islamic Sciences (CRCIS)
Size:
<Not Specified>
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: Persian Discourse Treebank and coreference corpus
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept PosterMerged
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Azadeh Mirzaei | Assistant Professor of Allameh Tabataba’i University | IR |
| Author 2 | Pegah Safari | MS in Artificial intelligence, Alzahra University | IR |
| Main Contact | Azadeh Mirzaei | Assistant Professor of Allameh Tabataba’i University | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Persian
Availability:
is not yet released
License:
The project was funded by Iran Information Technology Organization (ITO) and Computer Research Center of Islamic Sciences (CRCIS)
Size:
29982 sentences
Production Status:
Newly created-finished
Use:
Discourse
- Paper title: Persian Discourse Treebank and coreference corpus
- Paper track: Infrastructural Issues/Large Projects
- Paper status: Accept PosterMerged
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Azadeh Mirzaei | Assistant Professor of Allameh Tabataba’i University | IR |
| Author 2 | Pegah Safari | MS in Artificial intelligence, Alzahra University | IR |
| Main Contact | Azadeh Mirzaei | Assistant Professor of Allameh Tabataba’i University | None |
Documentation:
<Not Specified>
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese Dutch French German Italian Mongolian Persian Russian Spanish Swedish Turkish
Availability:
Freely Available
License:
CC0
Size:
700 hours
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus
- Paper track: Speech/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Changhan Wang | CoVoST | /N |
Documentation:
https://github.com/facebookresearch/covost




